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We present an exact Monte Carlo algorithm designed to sample theories where the energy is a sum of many 
couplings of decreasing strength. The algorithm avoids the computation of almost all non-leading terms. Its 
use is illustrated by simulating SU(2) lattice gauge theory with a 5-loop improved action. A new approach for 
dynamical fermion simulations is proposed. 



When sampling the partition function Z = 
fJ\dU e~ H " u >', the most common algorithm 
is that of Metropolis: at each step, starting from 
the configuration {£/}, a candidate configuration 
{U 1 } is proposed, and it is accepted with prob- 
ability P acc = min(l,e-( H ({ u '»- H ({ u »)). This 
acceptance test is realized by comparing P acc to 
a random number uniformly distributed in [0,1]. 
This seems like an excess of information: why 
compute H({U'}) exactly, then compare it with 
a random number? It should be sufficient to esti- 
mate it. This logical proposition was studied long 
ago |l|,||. The main difficulty which prevented 
the construction of an efficient algorithm at that 
stage was the existence of probability bound vi- 
olations for the noisy estimator of P aC c, which 
caused intolerable systematic errors. This diffi- 
culty was overcome in ||. Ref.(|], however, in- 
troduces an infinite number of auxiliary variables, 
and tests of the method are performed on a toy 
model with 5 degrees of freedom only. Here, we 
simplify the method of || , by introducing only 1 
auxiliary variable per term in H. Moreover, we 
separate H into a leading part to be calculated 
exactly, and a sum of correction terms, which we 
treat stochastically. 

Consider a generic Hamiltonian of the type 



*A more detaile d d escription of the NMC algorithm can 
be found in Ref.Jll) 
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H = I2fc=o c fc^fc> wnere as k increases, \ck\ de- 
creases. It is often the case that one would 
like to study a Hamiltonian of such type re- 
sulting from an expansion, be it perturbative 
||, non-perturbative ||, or based on the fixed 
point of a renormalization group transformation 
As k increases, the number of geometrically 
equivalent terms grouped into Wk increases expo- 
nentially. This combinatoric explosion normally 
makes the simulation of extended Hamiltonians 
prohibitively expensive. However, in most cases 
the couplings Ck decrease exponentially with fc, 
so that the overall Hamiltonian is dominated by 
Wo ■ By making use of stochastic methods to es- 
timate the correction terms Wk,k > 1, we aim 
at postponing the combinatoric explosion of the 
simulation costs incurred when including higher 
terms Wk- This opens the possibility of studying 
numerically much more complicated Hamiltoni- 
ans including higher-order correction terms. 

By shifting the Wk's by irrelevant constants, 
one can arrange that the terms cuWk be nonpos- 
itive starting from k = 1: CkWk(U) < VU. 
The contribution of the terms Wk{U),k > 1 is 
estimated stochastically by introducing auxiliary 
fields Cfc, which can take two values: and 1. 
We represent the probability e~ H in the form 
P [U]*Pi[U,a], where 

P [U] = e - c ° Wo ^ 
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P 1 [U,a] = H 



k=l a k =0,l 
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Po[C/] * Pi[U, cr] can be interpreted as the joint 
probability distribution for the original fields of 
the model and the new a fields. 

One can easily see why the introduction of aux- 
iliary a fields can be useful. Starting from the 
configuration {Ui,a}, a candidate configuration 
{U2} distributed with the weight -Po[^2] is pro- 
posed, and accepted with probability 

f Pi[U 2 ,q 
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Since the terms CkWk{U) contribute in P acc only 
if (Jk = 1, the amount of computational work is 
greatly reduced if the configurations with Ok = 
are dominating. That is certainly the case when 
the coefficients \ck\ are small: the probabilities 
for <Jk to be unity are negligible then: p ak =i = 
l_ e c k w k (u) K \ CkWk (U)\ wO if c fc «0. 

We illustrate these ideas on a 5-loop perturba- 
tively improved SU(2) gauge model in 4d: 
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where (m^m) = (1, 1), (2, 2), (1, 2), (1, 3), (3, 3) 
for i = 1, . . . , 5 denote the planar, fundamen- 
tal loops of size m x n. The Gibbs factor is 
exp(— §<S). One can construct a one-parameter 
set of actions which have no 0(a 2 ) and C(a 4 ) 
corrections 

Cl = (19 - 55 c 5 )/9, c 2 = (1 - 64 c 5 )/9, 

c 3 = (-64 + 640 c 5 ) /45, c 4 = 1/5 - 2 c 5 (4) 

Here we take C5 = 1/20 and /3 = 2.4. We estimate 
the contribution of all loops except the plaquette 
stochastically. For each loop I of sort 2 < i < 5 
we introduce the auxiliary variable <7i{l) — 0,1; 



and rewrite the contribution of this loop to the 
Gibbs factor in the form 



s ' j = ^2 [<W),o+<ki(o,i( e ' 

(T< (0=0,1 
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After each iVj updates of the fields C/ we update 
the a fields of sort i. We measure the average 
values of cr,, as listed in Table |[ They are quite 
small, so one avoids the computation of almost 
all of the extended "staples" in the U update. 

Table 1 

Average value of &i field for each loop of sort i. 



loop 


1x2 


1x3 


2x2 


3x3 


(<*) 


0.0753 


0.0199 


0.0202 


0.0018 



The efficiency of NMC is estimated by com- 
paring it with the non-noisy updating procedures 
(heatbath, overrelaxation) which are commonly 
used for the simulation of actions like (||).We label 
these usually applied techniques with the collec- 
tive name "Usual Monte Carlo" (UMC), to con- 
trast with NMC. 

For NMC the average computational cost per 
one update of the U fields on the entire lattice in 
units of matrix (link) multiplications is 
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where t^j is the update cost for the elementary 
plaquette action, Pj is the perimeter of loop i, 
Si is a symmetry factor: Sj = 1 for square loops 
and Si = 2 for rectangular loops (see Ref. [jll| for 
details). The computational cost for UMC per 
U update is approximately equal to the r.h.s. of 
expression (^|) in the limit Ni — ► 00 and (<jj) —> 1: 
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The naive gain in efficiency from using NMC is 
given by the ratio between the costs (^) and (||): 



.pi 



gain 



(8) 
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Table 2 

Integrated autocorrelation times for average loop traces in units of U updates for the 
(first column), and the NMC algorithm with different frequencies of a updates for 1x2, 
loops (other columns). The last row presents the naive gain for the NMC algorithm (0) 
given by eq.(fif) and depends on the observable. From the values of r jn t below, the real 



UMC algorithm 
1x3, 2x2 and 3x3 
. The real gain is 
gain is 0(4 — 6). 



number of U 


UMC 


1 


5 


5 for 1x2 


10 


10 for 1x2 


20 


30 


40 


updates per 


no 


for 


for 


15: 1x3,2x2 


for 


30: 1x3,2x2 


for 


for 


for 


1 <j update 


a 


all 


all 


105 for 3x3 


all 


210 for 3x3 


all 


all 


all 


Tint (lxl) 


0.7(1) 


1.9(2) 


2.3(1) 


2.5(2) 


3.1(2) 


3.2(2) 


4.3(4) 


4.5(4) 


3.8(2) 


T in t (1x2) 


0.8(1) 


2.6(3) 


2.8(2) 


3.2(2) 


4.3(4) 


3.9(3) 


5.2(4) 


5.6(5) 


5.7(4) 


Tint (1x3) 


0.8(1) 


2.7(3) 


2.8(2) 


3.2(2) 


4.3(4) 


3.9(3) 


5.1(4) 


5.4(5) 


5.4(4) 


Tint (2x2) 


1.0(1) 


3.4(5) 


3.3(3) 


3.7(3) 


4.7(4) 


4.7(3) 


5.3(4) 


5.8(6) 


5.7(4) 


Tint (2x3) 


1.4(3) 


4.2(7) 


3.8(3) 


4.2(3) 


5.5(5) 


5.7(4) 


5.7(5) 


6.2(6) 


6.3(5) 


Tint (3x3) 


1.8(4) 


5.0(8) 


4.5(5) 


5.0(5) 


5.8(6) 


6.4(5) 


5.9(5) 


6.3(6) 


6.3(5) 


r naive 
gain 


1 


7.2 


14.8 


18.3 


17.9 


20.6 


18.3 


20.8 


21.1 



One should also take into account the increase of 
autocorrelation times coming from the introduc- 
tion of variables a in the NMC algorithm, so the 
real gain is 

^UMC 



„real 



naive 
1 gain 



.NMC 



(9) 



where the second factor depends on the observ- 
able under consideration. 

It is not practical to keep the same updating 
frequencies 1/iVj for all sorts i of loops. In order 
for the work in the a and in the U updates com- 
ing from loops of sort i to remain comparable, 
one should keep the updating frequencies 1/iVj 
proportional to (er,): 
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Pi(tTi) 



(10) 



Due to the small influence of weakly coupled 
terms on the dynamics of the system, one can 
expect only insignificant changes in the autocor- 
relation behavior as JVj increases. 

Table ^ gives an impressive demonstration of 
the benefits which come from using the NMC al- 
gorithm. For the runs where the updating fre- 
quencies for a fields are adjusted as per eq.([Io]) we 
infer that the 'real gain' in computer time is of or- 
der 0(4— 6). One can expect a much greater gain 
for more complicated highly-improved actions. 

Let us now speculate on possibilities to use our 
algorithm to simulate a Hamiltonian with a very 



large number of terms. A specific example we 
have in mind is the case of full QCD, where the 
measure is, for 2 flavors of Wilson quarks: 



- e - s ^ det 2 (l - kM{U)) 

z 



(11) 



where S g is the local gauge action, M(U) is a 
hopping matrix connecting nearest neighbours on 
a 4d hypercubic grid, and Z normalizes the dis- 
tribution. The determinant can be turned into 
exp(Tr(Log(l - nM(U)))), then the logarithm 
expanded around 1, giving the loop expansion of 
the measure above: 



z 



(12) 



TrM(U) 1 can be represented as a sum over all 
closed non-backtracking loops of length I on the 
4d hypercubic lattice. The number of types of 
contributing loops ni is bounded by 7 l . Although 
this upper bound is not saturated, it is clear that 
the multiplicity of terms of a given length I grows 
exponentially: 



ni ~ Fi(l) a 1 ] 



a < 7 



(13) 



where F\(l) is a rational function of I, and a 1 
is the leading exponential behavior of the num- 
ber of loops of length I in the limit of large I. At 
first sight, it seems that sampling numerically the 
distribution ( |l2] ) is a disastrous idea: the action 
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contains an infinite number of terms, of exponen- 
tially growing multiplicity. Instead, other strate- 
gies are being used, based on the transformation 
of the determinant (|ll|) into a Gaussian integral. 

Nevertheless, the coupling 4- decreases expo- 
nentially as I increases. Therefore, the auxiliary 
variables cr; associated in our approach with var- 
ious loops of length I will take value almost 
always. One gets: 

(m) ~ a ~ F 2 (i) fcV (14) 

where the exponentially growing factor -f l comes 
from the average trace of Dirac matrices along 
the loops of length I. 

If one arranges the updating frequencies for 
each loop i as per eq.(|i"c|), one can expect that 
the average computer time needed for estimating 
the contribution of all loops of length I behaves 



ti ~ nil 2 {a{) ~F(l)*(a 1K ) 1 



(15) 



For k < K ca — -1- the computational cost ti de- 
creases exponentially with I and the total compu- 
tational cost of the algorithm t = J2i U converges 
to some finite value. Our rough estimation from 
fitting ni and average trace of Dirac matrices in 
the interval 4 < / < 12 gives a m 5.4; 7 « 1.4, 
and therefore n ca « 0.13. 

In the regime k < n ca we are in an interest- 
ing situation where the influence of very large 
loops is negligible because their associated cou- 
pling in the effective action is extremely small. 
Therefore, truncating the loop expansion above 
a certain order will introduce a statistically un- 
observable bias. Equivalcntly, one can freeze the 
associated a variables at the value zero, or update 
them with arbitrarily low frequency. In spite of 
this extremely (or infinitely) slow dynamic mode 
of the cr's, the dynamics of the gauge fields are 
not affected. Note that the cost of our algorithm 
grows linearly with the volume V of the system. 
This is better than alternative approaches to the 
simulation of full QCD: Hybrid Monte Carlo (cost 
oc V 5 /*) and MultiBoson (cost cx V(LogV) 2 ) §]. 
In addition the stepsize, or typical change at each 
update of a gauge link U, does not seem restricted 
a priori for small quark mass, unlike in the two 
alternative approaches above. 



A less speculative use of our algorithm for full 
QCD consists of truncating the loop expansion 
eq.([l2|) to some order l max , and representing the 
higher orders with the MultiBoson approach JjiJ ■ 
This strategy, called "UV-filtered MultiBoson", 
has already been used successfully 0). However, 
111 Ref.@ the loop expansion is truncated to its 
lowest term I = 4, because the exact evaluation 
of larger loops is too time-consuming. With our 
stochastic approach, these larger loops can be es- 
timated at low cost. We expect this composite 
strategy to be particularly efficient. 
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